NSF PAR Search | NSF Public Access Repository

TRANSFORMER EXPLAINER: Interactive Learning of Text-Generative Models

https://doi.org/10.1609/aaai.v39i28.35347

Cho, Aeree; Kim, Grace C; Karpekov, Alexander; Helbling, Alec; Wang, Zijie J; Lee, Seongmin; Hoover, Benjamin; Chau, Duen Horng Polo (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Transformers have revolutionized machine learning, yet their inner workings remain opaque to many. We present TRANSFORMER EXPLAINER, an interactive visualization tool designed for non-experts to learn about Transformers through the GPT-2 model. Our tool helps users understand complex Transformer concepts by integrating a model overview and smooth transitions across abstraction levels of math operations and model structures. It runs a live GPT-2 model locally in the user’s browser, empowering users to experiment with their own input and observe in real-time how the internal components and parameters of the Transformer work together to predict the next tokens. 125,000 users have used our open-source tool at https://poloclub.github.io/transformer-explainer/.
Free, publicly-accessible full text available April 11, 2026
Benchmark of DNN Model Search at Deployment Time

https://doi.org/10.1145/3538712.3538725

Zhou, Lixi; Jain, Arindam; Wang, Zijie; Das, Amitabh; Yang, Yingzhen; Zou, Jia (July 2022, SSDBM '22: Proceedings of the 34th International Conference on Scientific and Statistical Database Management)

Deep learning has become the most popular direction in machine learning and artificial intelligence. However, the preparation of training data, as well as model training, are often time-consuming and become the bottleneck of the end-to-end machine learning lifecycle. Reusing models for inferring a dataset can avoid the costs of retraining. However, when there are multiple candidate models, it is challenging to discover the right model for reuse. Although there exist a number of model-sharing platforms such as ModelDB, TensorFlow Hub, PyTorch Hub, and DLHub, most of these systems require model uploaders to manually specify the details of each model and model downloaders to screen keyword search results for selecting a model. We are lacking a highly productive model search tool that selects models for deployment without the need for any manual inspection and/or labeled data from the target domain. This paper proposes multiple model search strategies including various similarity-based approaches and non-similarity-based approaches. We design, implement and evaluate these approaches on multiple model inference scenarios, including activity recognition, image recognition, text classification, natural language processing, and entity matching. The experimental evaluation showed that our proposed asymmetric similarity-based measurement, adaptivity, outperformed symmetric similarity-based measurements and non-similarity-based measurements in most of the workloads.
Full Text Available
Modular machine learning for Alzheimer's disease classification from retinal vasculature

https://doi.org/10.1038/s41598-020-80312-2

Tian, Jianqiao; Smith, Glenn; Guo, Han; Liu, Boya; Pan, Zehua; Wang, Zijie; Xiong, Shuangyu; Fang, Ruogu (December 2021, Scientific Reports)
Alzheimer's disease is the leading cause of dementia. The long progression period in Alzheimer's disease provides a possibility for patients to get early treatment by having routine screenings. However, current clinical diagnostic imaging tools do not meet the specific requirements for screening procedures due to high cost and limited availability. In this work, we took the initiative to evaluate the retina, especially the retinal vasculature, as an alternative for conducting screenings for dementia patients caused by Alzheimer's disease. Highly modular machine learning techniques were employed throughout the whole pipeline. Utilizing data from the UK Biobank, the pipeline achieved an average classification accuracy of 82.44%. Besides the high classification accuracy, we also added a saliency analysis to strengthen this pipeline's interpretability. The saliency analysis indicated that within retinal images, small vessels carry more information for diagnosing Alzheimer's disease, which aligns with related studies.
Full Text Available
Bluff: Interactively Deciphering Adversarial Attacks on Deep Neural Networks

https://doi.org/10.1109/VIS47514.2020.00061

Das, Nilaksh; Park, Haekyu; Wang, Zijie J.; Hohman, Fred; Firstman, Robert; Rogers, Emily; Chau, Duen Horng (October 2020, IEEE Visualization Conference (VIS))

Full Text Available
Massif: Interactive Interpretation of Adversarial Attacks on Deep Learning

https://doi.org/10.1145/3334480.3382977

Das, Nilaksh; Park, Haekyu; Wang, Zijie J.; Hohman, Fred; Firstman, Robert; Rogers, Emily; Chau, Duen Horng (April 2020, Conference on Human Factors in Computing Systems)

Full Text Available
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Srivastava, Aarohi; Rastogi, Abhinav; Rao, Abhishek; Shoeb, Abu Awal; Abid, Abubakar; Fisch, Adam; Brown, Adam R.; Santoro, Adam; Gupta, Aditya; Garriga-Alonso, Adri; et al (January 2023, Transactions on machine learning research)

Full Text Available